Abstract - A central processing unit spends a considerable
share of its processing time on arithmetic operations,
especially multiplication. Multiplication is one of the basic
arithmetic operations, and it requires more hardware resources
and processing time than addition and subtraction. In fact, 9%
of all instructions in typical processing units are
multiplications. In this paper, a comparative study of
different multipliers is carried out with respect to low power
and high speed. Designing high-speed, low-power circuits in
CMOS technology is of great importance in VLSI. One of the
efficient logic styles in the logic family is the constant
delay (CD) logic style. In this paper, CD logic is modified and
a new logic style, known as low power high speed (LP-HS) logic,
is proposed. LP-HS logic is developed by introducing three
changes into the CD logic style, and it reduces the power delay
product.

Index Terms - Multiplier, CMOS, VLSI, power consumption,
constant delay logic (CD logic)


I. INTRODUCTION 

Multiplication is a fundamental function in arithmetic logic
operations. The computational performance of a DSP system is
limited by its multiplication performance [1], and
multiplication dominates the execution time of most DSP
algorithms [2]; a high-speed multiplier is therefore highly
desirable [3]. Multiplication time is still the dominant factor
in determining the instruction cycle time of a DSP chip. With
the increasing need for greater computing power on
battery-operated mobile devices, design emphasis has shifted
from optimizing conventional delay time and area to minimizing
power dissipation while still maintaining high performance [4].
Multipliers have normally been designed using the shift-and-add
algorithm, even though it is not well suited to VLSI
implementation and is poor from the delay point of view.
Important algorithms proposed in the literature for fast
multiplication that can be implemented in VLSI include the
Booth multiplier, the array multiplier and the Wallace tree
multiplier [1]. This paper presents the fundamental technical
aspects behind these approaches.

Low-power, high-speed VLSI circuits can be implemented with
different logic styles. The three important considerations in
VLSI design are power, area and delay [5-6]. High-performance,
energy-efficient logic styles are of vital importance in VLSI
circuits. CMOS is the dominant technology used to construct
these types of integrated circuits. The three most widely
accepted parameters for measuring the quality of a circuit are
area, delay and power [12]. Advances in CMOS technology have
led to improvements in performance in terms of area, power or
delay. There is always a tradeoff among these in a circuit
[13]. The power delay product is a figure of merit for
comparing logic circuit technologies or families.
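
As a rough illustration of how the power delay product can be used to rank logic styles, the following Python sketch multiplies average power by propagation delay. The function name, the style labels and all numeric values are hypothetical and are not measurements from this paper.

```python
def power_delay_product(avg_power_w: float, delay_s: float) -> float:
    """Return the power delay product in joules (average power x delay)."""
    if avg_power_w < 0 or delay_s < 0:
        raise ValueError("power and delay must be non-negative")
    return avg_power_w * delay_s


# Hypothetical figures, chosen only to illustrate the comparison.
candidates = {
    "style_a": (20e-6, 150e-12),  # 20 uW average power, 150 ps delay
    "style_b": (12e-6, 200e-12),  # 12 uW average power, 200 ps delay
}

pdp = {name: power_delay_product(p, d) for name, (p, d) in candidates.items()}
best = min(pdp, key=pdp.get)  # the lowest PDP is the better figure of merit
print(best, pdp[best])
```

With these made-up numbers the slower style still has the lower power delay product, which is why the product, rather than power or delay alone, is used for the comparison.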

The constant delay logic style targets high-speed
applications. Its constant delay characteristic makes it
suitable for implementing complicated logic expressions such
as addition. Multipliers play a major role in arithmetic
operations. In this paper, both the constant delay logic style
and the Low Power High Speed logic style are analysed [14].

II. OBJECTIVES 

The aim of a good multiplier is to provide a physically
compact, high-speed, low-power unit. Being an important part of
the arithmetic processing unit, multipliers face extremely high
demands on speed and low power consumption. Reducing the number
of operations reduces the dynamic power, which is a major part
of the total power consumption, and in turn significantly
reduces the power consumption of the multiplier design.
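
The link between the number of operations and dynamic power can be sketched with the standard first-order CMOS switching-power estimate, P = a * C * Vdd^2 * f. This model is not stated in the paper; it is brought in here as a well-known approximation, and the parameter values below are hypothetical.

```python
def dynamic_power(activity: float, capacitance_f: float,
                  vdd_v: float, freq_hz: float) -> float:
    """First-order CMOS switching power: activity * C * Vdd^2 * f (watts)."""
    return activity * capacitance_f * vdd_v ** 2 * freq_hz


# Hypothetical values: 2 pF switched capacitance, 1.2 V supply, 1 GHz clock.
baseline = dynamic_power(0.5, 2e-12, 1.2, 1e9)
# Halving the switching activity (fewer operations) halves the dynamic power.
reduced = dynamic_power(0.25, 2e-12, 1.2, 1e9)
print(baseline, reduced)
```

Under this model, any technique that lowers the switching activity of the multiplier lowers its dynamic power in direct proportion.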

III. TECHNIQUES AND FUNCTIONS 

Depending on parameters such as latency, throughput and design
complexity, there are different techniques for performing
binary multiplication. The more efficient parallel approaches
sum the partial products using some sort of array or tree of
full adders. The array multiplier, the Booth multiplier and the
Wallace tree multiplier are some of the standard approaches to
the hardware implementation of binary multipliers that are
suitable for VLSI implementation at the CMOS level.

Designing low-power, high-speed circuits in CMOS technology is
of great importance in VLSI. One of the efficient logic styles
in the logic family is the constant delay (CD) logic style, and
a modified version of it is the low power high speed (LP-HS)
logic.

3.1. Array multiplier

Multiplication of two binary numbers can be obtained in one
micro-operation by using a combinational circuit that forms
the product bits all at once. The only delay is the time for
the signals to propagate through the gates that form the
multiplication array, which makes this a fast way of
multiplying two numbers.

In an array multiplier, consider two binary numbers A and B of
m and n bits. There are mn summands, produced in parallel by a
set of mn AND gates. An n x n multiplier requires n(n-2) full
adders, n half adders and n^2 AND gates. The worst-case delay
of an array multiplier is (2n+1) t_d.
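
The counts and the summand structure described above can be checked with a small behavioural sketch in Python. It is not a gate-level model: the function names are illustrative, the operands are assumed to be unsigned and of equal width n, and the adder array is replaced by ordinary integer addition.

```python
def array_multiplier_cost(n: int) -> dict:
    """Component counts for an n x n array multiplier, using the formulas in the text."""
    if n < 2:
        raise ValueError("n must be at least 2")
    return {
        "and_gates": n * n,
        "full_adders": n * (n - 2),
        "half_adders": n,
        "worst_case_delay_td": 2 * n + 1,  # in units of t_d
    }


def array_multiply(a: int, b: int, n: int) -> int:
    """Multiply two unsigned n-bit numbers by ANDing bits and adding shifted rows."""
    product = 0
    for j in range(n):                    # one row of summands per bit of B
        b_j = (b >> j) & 1
        row = 0
        for i in range(n):
            a_i = (a >> i) & 1
            row |= (a_i & b_j) << i       # one AND gate per summand
        product += row << j               # the adder array sums the shifted rows
    return product


print(array_multiplier_cost(4))
print(array_multiply(13, 11, 4))
```

For n = 4 the formulas give 16 AND gates, 8 full adders and 4 half adders, and the behavioural model reproduces ordinary multiplication for every pair of 4-bit inputs.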

The array multiplier has higher power consumption and requires
an optimum number of components, but its delay is larger. It is
less economical [7][8] because it uses a larger number of
gates, which in turn increases the area. Thus, it is a fast
multiplier, but its hardware complexity is high [9].
Fig 1. Array multiplier


3.2. Wallace tree multiplier 

A fast process for the multiplication of two numbers was
developed by Wallace [10]. In this method, a three-step
process is used to multiply two numbers: the bit products are
formed; the bit product matrix is reduced to a two-row matrix
in which the sum of the rows equals the sum of the bit
products; and the two resulting rows are summed with a fast
adder to produce the final product.

Three bit signals are passed to a one-bit full adder (“3W”),
which is called a three-input Wallace tree circuit. The sum
output signal is supplied to the next-stage full adder of the
same bit position, and the carry output signal is supplied to
the next-stage full adder located one bit position higher.

A Wallace tree is a tree of carry-save adders arranged as shown
in Fig 2. A carry-save adder consists of full adders, like the
more familiar ripple adder, but the carry output from each bit
is brought out to form a second result vector rather than being
wired to the next most significant bit. The carry vector is
'saved' to be combined with the sum later. In the Wallace tree
method, the speed of operation is high, but the circuit layout
is not easy because the circuit is quite irregular [7].
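
The three-step process and the carry-save adder can be illustrated with the following Python sketch. It is a simplified, row-wise model under stated assumptions: operands are unsigned, whole rows are reduced three at a time rather than column by column as in a real Wallace tree layout, and the final fast adder is replaced by integer addition. The function names are illustrative.

```python
from typing import Tuple


def carry_save_add(x: int, y: int, z: int) -> Tuple[int, int]:
    """Reduce three operands to a sum vector and a carry vector (no carry propagation)."""
    sum_vec = x ^ y ^ z                               # per-bit full-adder sum outputs
    carry_vec = ((x & y) | (y & z) | (x & z)) << 1    # carries move one bit position higher
    return sum_vec, carry_vec


def wallace_multiply(a: int, b: int, n: int) -> int:
    """Multiply two unsigned n-bit numbers using carry-save reduction."""
    # Step 1: form the bit products as shifted rows.
    rows = [(a << j) if (b >> j) & 1 else 0 for j in range(n)]
    # Step 2: reduce the rows three at a time until only two remain.
    while len(rows) > 2:
        reduced = []
        full_groups = len(rows) - len(rows) % 3
        for k in range(0, full_groups, 3):
            s, c = carry_save_add(rows[k], rows[k + 1], rows[k + 2])
            reduced.extend([s, c])
        reduced.extend(rows[full_groups:])            # leftover rows pass through unchanged
        rows = reduced
    # Step 3: add the two remaining rows (stands in for the final fast adder).
    return sum(rows)


print(wallace_multiply(13, 11, 4))
```

Each carry-save stage preserves the total of its three inputs, so the two rows that reach the final adder always sum to the full product.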



Fig 2. Wallace tree multiplier


3.3. Booth Multiplier 

Another improvement in the multiplier is to reduce the number
of partial products generated. The Booth recoding multiplier
is one such multiplier; it scans three bits at a time to reduce
the number of partial products [11]. These three bits are the
two bits of the present pair and a third bit, which is the
high-order bit of the adjacent lower-order pair. After each
triplet of bits has been examined, Booth logic converts it into
a set of five control signals, which the adder cells in the
array use to control the operations they perform.
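
The triplet scan described above corresponds to radix-4 Booth recoding, which can be sketched in Python as follows. The sketch assumes an even word length n and a two's-complement multiplier; the function names are illustrative, and the recoded digits are shown as signed integers rather than as the hardware control signals mentioned in the text.

```python
def booth_radix4_digits(multiplier: int, n: int) -> list:
    """Recode an n-bit two's-complement multiplier (n even) into digits in {-2,-1,0,1,2}."""
    if n % 2:
        raise ValueError("n must be even")
    bits = multiplier & ((1 << n) - 1)    # n-bit two's-complement bit pattern
    digits = []
    prev = 0                              # implicit 0 to the right of bit 0
    for i in range(0, n, 2):
        b0 = (bits >> i) & 1              # low bit of the present pair
        b1 = (bits >> (i + 1)) & 1        # high bit of the present pair
        digits.append(-2 * b1 + b0 + prev)
        prev = b1                         # high bit feeds the next (higher) triplet
    return digits


def booth_multiply(multiplicand: int, multiplier: int, n: int) -> int:
    """Sum one partial product per recoded digit, each weighted by a power of 4."""
    return sum(d * multiplicand * 4 ** k
               for k, d in enumerate(booth_radix4_digits(multiplier, n)))


print(booth_radix4_digits(6, 4))
print(booth_multiply(7, -3, 4))
```

An n-bit multiplier yields only n/2 recoded digits, so the number of partial products is halved compared with examining one bit at a time, and a digit of 0 means the corresponding addition can be skipped.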

To speed up multiplication, Booth encoding performs several
steps of multiplication at once. Booth’s algorithm takes
advantage of the fact that an adder-subtractor is nearly as
fast and small as a simple adder.

From the basics of Booth multiplication it can be shown that
the addition/subtraction operation can be skipped when
successive bits in the multiplier are the same: if three
consecutive bits are the same, the addition/subtraction
operation can be skipped. Thus, in most cases the delay
associated with Booth multiplication is smaller than that of
the array multiplier. However, the delay performance of the
Booth multiplier is input-data dependent. In the worst case,
the delay of the Booth multiplier is on par with that of the
array multiplier [1].

By examining three bits at a time, Booth recoding reduces the
number of adders and hence the delay required to produce the
partial sums. The high performance of the Booth multiplier
comes with the drawback of power consumption, because the large
number of adder cells required consumes a large amount of
power [11].


Table 1. Comparison between multipliers 


Parameter | Array multiplier | Wallace tree multiplier | Booth’s multiplier
Operation speed | Less | High | Highest, as the cycle length is as small as possible
Power consumption | Most | More | Less
Area | Maximum area, as it uses a larger number of adders | Maximum area, as a Wallace tree is used to reduce operands | Minimum area, because an adder/subtractor is almost as small/fast as an adder
Complexity | Less complex | More complex | Most complex


IV. CONSTANT DELAY LOGIC STYLE 

Due to the continuous demand for higher operating frequency,
energy-efficient logic styles are always important in VLSI.
Digital circuits need a high clock frequency to achieve the
fastest performance. Feedthrough logic (FTL) [15,16,17] is one
of the efficient logic styles in the CMOS dynamic domino logic
domain. It has low dynamic power consumption and lower delay
than other dynamic logic styles [18, 19, 20]. Static CMOS
logic circuits are less efficient than dynamic logic circuits,
which offer better speed and require fewer transistors.

To solve the problems associated with feedthrough logic, a new
high-performance logic style called constant delay (CD) logic
has been designed. It achieves better energy efficiency than
other logic styles. Complicated logic expressions can be
implemented with this high-performance, energy-efficient logic
style [21]. It exhibits a unique characteristic: the output is
pre-evaluated before the input from the preceding stage is
ready. The constant delay logic style, which is used for
high-speed applications, is shown in Fig 3.
Fig 5: Timing Diagram of CD Logic [21]


Fig 3: Constant Delay Logic Style [21] 

Compared with feedthrough logic, CD logic has two extra
blocks: the timing block (TB) and the logic block (LB). The
timing block consists of a self-reset technique and a window
adjustment technique, which enable robust logic operation with
lower power consumption and higher speed. The logic block
reduces unwanted glitches and also makes cascading CD logic
feasible. The unique characteristic of this logic is that the
output is pre-evaluated before the inputs from the preceding
stage are ready. The inputs are applied to an nMOS pull-down
network, and the output is determined by the logic implemented
in that network. A buffer circuit implemented using CD logic,
together with the expanded diagrams of the timing block and the
logic block, is shown in Fig 4.


Predischarge mode occurs when CLK is high, and evaluation mode
occurs when CLK is low. During predischarge mode, X is
predischarged to GND and Out is precharged to VDD. During
evaluation mode, three different conditions can occur in CD
logic: contention, C-Q delay and D-Q delay. Contention mode
occurs when IN = 1 for the entire evaluation period. During
this time a direct-path current flows from the pMOS transistor
to the pull-down network (PDN); X rises to a nonzero voltage
level and Out experiences a temporary glitch. C-Q
(clock-to-out) delay occurs when IN goes to 0 before CLK
transitions to low. At this time X rises to logic 1 and Out is
discharged to GND; the delay is measured from CLK to Out. D-Q
delay occurs when IN goes to 0 after CLK transitions to low.
During this time X initially enters contention mode and later
rises to logic 1; the delay is measured from IN to Out.
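
The three evaluation-mode conditions can be summarised as a small decision rule. The Python sketch below is only a timing classifier, not a circuit simulation; the function name and the convention of passing None when IN never falls are assumptions made for illustration, and a tie between the two edges is arbitrarily treated as D-Q delay.

```python
from typing import Optional


def cd_evaluation_mode(clk_fall_time: float, in_fall_time: Optional[float]) -> str:
    """Classify the evaluation-mode condition of a CD logic stage.

    clk_fall_time: time at which CLK transitions to low (evaluation starts).
    in_fall_time:  time at which IN goes to 0, or None if IN stays at 1
                   for the entire evaluation period.
    """
    if in_fall_time is None:
        return "contention"        # IN = 1 throughout: direct-path current, glitch on Out
    if in_fall_time < clk_fall_time:
        return "C-Q delay"         # IN settled before the clock edge: delay from CLK to Out
    return "D-Q delay"             # IN settles after the clock edge: delay from IN to Out


print(cd_evaluation_mode(1.0, None))
print(cd_evaluation_mode(1.0, 0.4))
print(cd_evaluation_mode(1.0, 1.3))
```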




Fig 4: Buffer Using CD Logic [21] 

The chain of inverters acts as the local window technique and
the NOR gate as the self-reset circuit. The length of the
inverter chain varies according to the circuit to be designed.
The primary aim of the inverter chain is to provide a delayed
clock. The contention problem, which is one of the
disadvantages of feedthrough logic, is reduced with the help of
this window adjustment. In the self-reset circuit, one input of
the NOR gate is the intermediate output node X and the other is
the clock. The logic block is simply a static inverter, as in
dynamic domino logic. Since the above circuit is a buffer, the
nMOS pull-down network consists of only one nMOS transistor.

The timing diagram for constant delay logic is shown in Fig 
5. CD logic works under two modes of operation. 

i. Predischarge mode (CLK=1) 

ii. Evaluation mode (CLK=0) 


V. PROPOSED LP-HS LOGIC 

The proposed LP-HS logic is derived from the existing constant
delay logic. It differs from CD logic in three main ways.
First, there is no window adjustment technique. Second, a pMOS
transistor is used instead of an nMOS evaluation transistor.
Third, transistors M2 and M3 are added in parallel below the
pull-down network. The proposed logic helps to reduce the power
and the delay, which in turn reduces the power delay product.
The circuit diagram of the proposed logic is shown in Fig 6.





Fig 6: Proposed LP-HS Logic
The NOR gate, which acts as the self-resetting logic, is
formed by transistors M5, M6, M7 and M8, which are driven by
the clock and the intermediate output node X. Transistors M0
and M1, whose gates are driven by CLK and the output of the NOR
gate, are connected in series. This increases the resistance,
which in turn helps to reduce the power. M4 acts as the
evaluation transistor. Transistors M2 and M3 are connected in
parallel and placed below the nMOS pull-down network. These
transistors help to reduce the power delay product. The gate of
M2 is driven by the clock, and M3 is at ground. The IN values
are applied to the nMOS pull-down network, which is built
according to the circuit to be designed. Transistor M2
increases the dynamic resistance of the pull-down network,
which in turn helps to reduce the power consumption.
Transistors M9 and M10 together form the static inverter, which
makes cascading the logic more feasible.

The circuit works under two modes of operation. 

i. Precharge mode (CLK=0) 

ii. Evaluation mode (CLK=1) 

Evaluation mode occurs when the clock is high, and precharge
mode occurs when the clock is low. When the clock is low,
transistor M4 turns ON and provides a high value at node X,
which in turn provides a low value at the output node OUT.

When the clock is high, transistor M2 turns ON, and the nMOS
pull-down network is evaluated and gives the output.

During this time, transistor M0, whose gate is driven by CLK,
is OFF. As a result, contention mode is eliminated during
evaluation, which in turn makes the window adjustment technique
unnecessary in the proposed logic. The elimination of the
window adjustment technique is one of the reasons for the power
and delay reduction in the circuit. During evaluation mode, the
pull-down network and transistor M2 turn ON, which provides a
high dynamic resistance that further reduces the power.
Transistor M3 is always ON, which offers an easy discharge path
to ground.
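
The two modes of operation can be captured in a purely logical sketch. The Python model below ignores timing, the self-reset NOR gate and transistor-level effects. It assumes that, during evaluation, node X is pulled low when the pull-down network conducts and otherwise keeps its precharged high value; that behaviour is an assumption in the style of domino logic, not something stated explicitly in the text. The function name is illustrative.

```python
def lp_hs_stage(clk: int, pdn_conducts: bool) -> dict:
    """Logic-level view of one LP-HS stage: returns the values of X and OUT."""
    if clk == 0:
        x = 1                              # precharge: M4 is ON and pulls X high
    else:
        # Evaluation: M2 is ON; assumed that a conducting pull-down network
        # discharges X and that X otherwise stays at its precharged value.
        x = 0 if pdn_conducts else 1
    return {"X": x, "OUT": 1 - x}          # static inverter (M9, M10) drives OUT


# Buffer example: the pull-down network is a single nMOS that conducts when IN = 1.
for in_value in (0, 1):
    print(in_value, lp_hs_stage(clk=1, pdn_conducts=bool(in_value)))
```

For the buffer case, OUT follows IN during evaluation and returns to 0 in every precharge phase.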

VI. CONCLUSION 

The array multiplier consumes more power and requires the
maximum number of components, and its delay is greater than
that of the Wallace tree multiplier. The Booth multiplier is
superior in all aspects, such as delay, speed, area,
complexity and power consumption. It can therefore be
concluded that the Booth multiplier is recommended where low
power and low delay are required.

The new logic, called LP-HS logic, is developed by modifying
constant delay logic. Multipliers are designed using both the
existing and the proposed logic.